Sometimes it can be difficult to put all of the pieces together when they are dealt with separately, so here I want to demonstrate how we might approach analyzing a simple study in a Bayesian context. First, let’s explain the system and research question.
The system and question
My former M.S. student, Mitch Le Sage, was interested in the role that scavenging invertebrates might play in the transmission and dynamics of Ranavirus epidemics. His general hypothesis was that by removing infectious carcasses, scavengers might reduce transmission rates among amphibian larvae in ponds.
One question that came out of this research was whether invertebrate scavengers, such as dytiscid beetle larvae, were capable of removing many amphibian carcasses. Eating one carcass in a lab setting is one thing, but what if there were a lot of carcasses, as in a die-off? Would the scavengers be able to keep up?
Mitch and an undergraduate in the lab set up an experiment where dytiscid beetle larvae were housed in small aquaria with 1, 2, 5, 8, or 10 carcasses of Ambystoma macrodactylum larvae to see how much was consumed in a 48-hour period. Since carcasses are scavenged, rather than consumed whole, he measured the starting and ending mass of the carcasses.
This sort of experiment falls reasonably well under the heading of “Functional Responses”, or perhaps you have heard of the “Holling Disc equations.” The history of these is pretty fun1, but for our purposes we can just consider two versions of how the foraging (or scavenging) rate increases with the availability of food (or carcasses).
First, we might simply assume that the foraging rate increases with the density of food. If there are more carcasses in the environment a scavenger will find one to consume more quickly. We could write:
\[
\mu_i = a \times c_i,
\] meaning that the average or expected amount of carcass mass consumed in a particular aquarium (\(\mu_i\)) increases linearly with the amount of carcass mass available (\(c_i\)) at rate \(a\). The parameter \(a\) is often called the “attack rate.” This is called, very creatively, the Type I functional response.
Secondly, we might assume that the rate of carcass removal saturates or levels off simply because a scavenger can only eat carcass so fast. At low carcass densities the scavenging rate is limited by the time it takes to find a carcass, but when carcasses are abundant the rate of removal is instead limited by handling time, \(h\). A classic way to frame this so-called Type II functional response2 is:
\[
\mu_i = \frac{a \times c_i}{1+a\times c_i\times h}
\] We wanted to see which of these models best described carcass removal and then to extrapolate our findings, including our parameter uncertainty, to the pond-level.
Simulating data from the Type I functional response
Let’s start not by looking at their data, but instead by making up some data. We’ll begin with just the linear model, \(\mu_i = ac_i,\) but can then repeat the pattern with the second model.
First, our “predictor” data: The starting masses of the carcasses ranged from a low of ~1/5th of a gram to ~10 grams, so we’ll simulate a data set over this range of carcass values. Since we used 18 aquaria in the study, we’ll use that many observations in our simulations, too.
n <- 18 # number of aquaria
carc <- runif(n, min = 0.2, max = 10)
carc
So these are some made up observations of initial carcass masses. There are two things that are quite unrealistic. First, we have a huge number of digits to the right of the decimal place, but that’s probably not something to worry about. Second, since we actually added 1, 2, 5, 8, or 10 carcasses in a tank it’s probably unrealistic to assume that all carcass masses are equally likely, even though the carcasses varied a bit in size. We could just run with this—it may not matter a lot—or we could more realistically simulate our carcass masses.
Indeed, looking back at the paper, we actually used different numbers of replicates for each number of carcasses, so let’s try to make this more realistic.
# actual numbers of carcasses used in each replicate
n_carc <- c( rep(1, 5), rep(2, 5), rep(5, 3), rep(8, 3), rep(10, 2) )
# now let's draw random carcass sizes
carc <- rnorm(n = 18, mean = 1*n_carc, sd = 0.25) # mean of 1g/carc
carc
That actually does a reasonable job of recapitulating the actual carcass masses!
Next, we need to simulate how much of the initial carcass mass is removed. Let’s pick an attack rate. You may not know much about what this should be, but you can at least make some basic assumptions. It cannot be greater than 1, as that would imply that more carcasses are eaten than available. It must be greater than zero. So we can at least say that \(1 \geq a \geq 0\). If we thought some more about the math and the system we could probably do better—for instance, only a fraction of the carcass is scavengable; the cartilage is probably not—but for a start, let’s just say \(a=1/2\).
a <- 0.5 # attack rate in g/48h
mu <- a*carc
Let’s plot our new “data”.
plot(mu ~ carc)
And…it’s a line. I guess that shouldn’t be surprising, right? We just specified that it would be! But our data, that is our observations, should not follow exactly along this predicted or expected line, right? I mean, we’d expect a bit of noise both from measuring carcass remains3 and from differences in, among other things, the appetite of a dytiscid beetle larva, how good or bad it is at finding carcasses, and vagaries of where the carcasses end up relative to the beetle larva. Let’s therefore simulate observations from our expectation.
I would think that these slight variations are probably normally distributed. Unless we have a good reason to think otherwise, this is usually a reasonable starting point. We have our mean or expected value, \(\mu_i\), but we need to describe some variation from this expectation. As a start, let’s say \(\sigma = 1\).
obs <- rnorm(n, mean = mu, sd = 1)
plot(obs ~ carc, col = "red") # observations
lines(mu ~ carc) # "expected" relationship
abline(a = 0, b = 1, lty = 2) # a 1:1 line just for context
That is a fair bit of variation! Moreover, some values are now going negative! That won’t work!
So what are our options? Well, we could turn down the noise (standard deviation or \(\sigma\)) to ensure that no observations are less than zero. We could also just set every observation that is less than zero to zero. However, both of these strategies seem like we are imposing arbitrary limits on the variability in observations. Plus, with ad hoc approaches like the second one, coding things gets hard (especially in the language of our Bayesian models)4. A better option would be to use a distribution of observations that ensures positive values.
One that comes readily to mind is the lognormal. Indeed, it is just a normal distribution that is exponentiated. Let’s see what that looks like:
# Normal distribution
hist(rnorm(1e4), breaks = 50)
# exponentiated normal distribution
hist(exp(rnorm(1e4)), breaks = 50, border = "blue", col = NA)
# the lognormal distribution is the same as the exponentiated normal
hist(rlnorm(1e4), breaks = 50, border = "red", col = NA, add = TRUE)
See?
Note, however, that converting between the standard deviation of the normal and lognormal is a bit weird. A standard deviation of 1 on the normal, linear scale would imply a standard deviation of \(\log(1) = 0\) on the log scale, but that would mean no deviation at all! Anyway, this is why we simulate, so we can get a sense of what these numbers imply.
obs <- rlnorm(n, meanlog = log(mu), sdlog = 1/2)
plot(obs ~ carc, col = "red") # observations
lines(mu ~ carc) # predicted line
abline(a = 0, b = 1, lty = 2) # a 1:1 line just for context
So that’s a bit better, but perhaps too much noise. Let’s tone it down a bit.
obs <- rlnorm(n, meanlog = log(mu), sdlog = 1/3)
plot(obs ~ carc, col = "red") # observations
lines(mu ~ carc) # predicted line
abline(a = 0, b = 1, lty = 2) # a 1:1 line just for context
points(obs ~ carc, col = "blue") # new observations
Any points above the 1:1 line might be interpreted as measurement error on the initial or final carcass measurements. It’s probably not worth getting too worked up about these issues5; we have data that are fairly realistic and avoid real issues like negative values.
Simulating across possible values of \(a\) (posterior prediction)
So far we have simply picked a single value of the parameters \(a\) and \(\sigma_{\log}\) (my fancy name for the sdlog parameter in the rlnorm() function call, above). But we do not necessarily know what these values are. Instead, we can simulate different data sets from different underlying expected relationships between the initial and consumed carcass mass based on different values of the parameters. Seem like a lot? It’s not too bad.
Let’s start by focusing on simulating a bunch of values of \(a\) and seeing what those imply about the expected relationship between initial and consumed carcass mass. We can keep our guess of 0.5 as maybe our mean, but allow these values to change a bit around this value.
as <- rnorm(n = 20, mean = 0.5, sd = 0.25) # twenty values of a
Let’s see what these look like:
plot(x = carc, y = carc, type = "n") # just to set the scale of the graph
abline(a = 0, b = 1, lty = 2) # one-to-one line... slope is parameter b
for(i in 1:20){
  abline(a = 0, b = as[i], col = "gray")
}
So we have twenty different possible values of the attack rate, \(a\), which is the slope of these lines. They cover a wide range of possibilities, from (in my simulation) almost zero to almost one, but most are in the middle. I guess without knowing any more than this, the results seem pretty OK to me. Now, however, we would like to simulate different data sets given these different possible values of the attack rate. Let’s start with 10 different data sets, though we could of course change this pretty easily.
We want to construct a data frame with ten data sets each with 18 observations that come from a different possible value of \(a\) and \(\sigma_{\log}\). Let’s build this up.
df <- data.frame(
  # our index of the data set number
  d = rep(1:10, each = 18),
  # the initial carcass mass; same as above, only we replicate
  # the n_carc vector 10 times, one for each data set.
  carc = rnorm(n = 18*10, mean = 1*rep(n_carc, times = 10), sd = 0.25)
)
head(df)
hist(df$carc, breaks = 30) # can see the peaks in carcass mass around 1-3, 5, 8, and 10 carcasses
This data frame so far just has our predictor or independent variable, initial carcass weights.
Now we need to match our values of \(a\) with our different replicate data sets, one value for each data set, and use that to calculate the expected values.
df$a <- as[df$d]
See what I did? as is a vector of possible values of the parameter \(a\). I want to use the first value in that vector for the first replicate data set, the second value for the second data set, and so on. Since the d column is the index of the replicate we can just use that to pull out the appropriate entry.
We can then calculate the expected value of the carcasses consumed, mu, for each value of carc using the replicate-specific value of a
df$mu <- df$carc * df$a
plot(mu ~ carc, data = df)
abline(a = 0, b = 1, lty = 2) # one-to-one line... slope is parameter b
for(i in 1:20){
  abline(a = 0, b = as[i], col = "gray")
}
So we can see that we now have 10 data sets with different initial carcass densities (\(x\)-values) and different slopes (values of \(a\)). However, we still do not yet have data sets of observations. For that, we need to simulate observations from the expected values given some level of noise. We were using a lognormal distribution, but above we had just used a single value of sdlog = 1/2 (\(\sigma_{\log}\)). We do not know that this value is right. Shouldn’t we explore the effect of different possible values of this parameter, too?
Since \(\sigma_{\log}\) must be greater than zero, but shouldn’t go to crazy high values, let’s assume it is exponentially distributed with a mean of 1/2. The exponential distribution has a single parameter, \(\lambda\), often called the rate parameter. The average of the distribution is simply \(1/\lambda\), so for a mean of 1/2 we should use rate = 2.
# possible values of sdlog
hist(rexp(n = 1e4, rate = 2), breaks = 20)
We can use the same trick as above to assign a different value of \(\sigma_{\log}\) to each replicate data set.
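The code for this step did not make it into the text above, so here is a minimal sketch of what it might look like. The vector name sdlogs is an assumption, chosen to match its later use when we check the posterior against the True values:

```r
# one value of sigma_log per replicate data set
sdlogs <- rexp(n = 10, rate = 2)
# assign the replicate-specific value to each row, same indexing trick as with a
df$sdlog <- sdlogs[df$d]
# simulate observations around the expected values
df$obs <- rlnorm(n = nrow(df), meanlog = log(df$mu), sdlog = df$sdlog)
```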
If we plot these observations (in red) we can see what our parameters (\(a\) and \(\sigma_{\log}\)) say can happen:
plot(mu ~ carc, data = df) # expected values | a
abline(a = 0, b = 1, lty = 2) # one-to-one line... slope is parameter b
for(i in 1:10){
  abline(a = 0, b = as[i], col = "gray")
}
points(obs ~ carc, data = df, col = "red") # simulated observations
That is a lot, but it is clear that we sometimes get observations well above the one-to-one (dashed) line.
We can get a bit more a sense of this if we plot each data set separately.
# par(ask = TRUE) # <- asks you to hit return between each new graph
for(i in 1:10){
  plot(obs ~ carc, data = df, type = "n") # for consistent axis limits
  points(obs ~ carc, data = df[df$d == i, ])
  abline(a = 0, b = 1, lty = 2)
}
# par(ask = FALSE) # reset
We can see that the problems occur more often when we get larger values of sdlog used in simulating observations. We have a couple of choices. We could use a different distribution to describe the possible values of sdlog, which, again, is just saying how much variation we should see around the expected value, \(\mu_i\). Options include the gamma or beta distributions, which don’t have such long tails. Our second choice would just be to say, you know, we’re making up stuff to see if our code works and to better understand what is reasonable; let’s file this away as sort of problematic, but probably not terribly important. I suggest we take option two for the moment so we don’t get too caught up in things that don’t matter a great deal. Besides, we can always come back to this if we want.
Simulating data from the Type II functional response
Again, our type II function response is of the form: \[
\mu_i = \frac{a \times c_i}{1+a \times c_i \times h},
\]
where \(h\) represents handling time, but everything else is the same. So this should be quick.
We can constrain handling time to be \(\geq 0\) from first principles, but I think we can also say that handling time is also unlikely to be super high, as in it is unlikely that handling time to consume a carcass is on the order of the time of the experiment (48h). We can therefore assume \(1 > h > 0\).
Let’s try simulating some values. I’m going to use the curve function, which is sort of like abline() except that it plots an arbitrary function instead of just a line.
# attack rate is 0.5,
# handling time is 0.2 (20% of the duration of the experiment)
# x is the initial carcass mass
curve(0.5*x/(1 + 0.5*x*0.2), from = 0, to = 10)
abline(a = 0, b = 1, lty = 2)
It seems that this line curves away from the 1:1 line a bit, but not super fast. Play with this a bit to get a sense of how handling time affects the shape of the curve. I think you’ll see that as \(h \rightarrow 0\) this line becomes more and more like the Type I functional response.
Now let’s allow for variation in \(h\). I’m going to use a beta distribution because it is bounded between zero and one and allows some control over the shape. I’m just tweaking the shape1 and shape2 parameters (sometimes called \(\alpha\) and \(\beta\) in more theoretical frameworks) until I get something that is skewed towards low values, but does not exclude higher values.
hs <- rbeta(n = 10, shape1 = 1, shape2 = 3)
plot(0:10, 0:10, type = "n") # set axis boundaries
abline(a = 0, b = 1)
for(i in 1:10){
  curve(as[i]*x/(1 + as[i]*x*hs[i]), add = TRUE, col = "gray")
}
If we’re happy with these values, and I guess I am, we can then use them to generate expected values for any given initial carcass mass, like we did before with the Type I functional response. In fact, let’s just recycle what we have and add a column to the data frame for h, which we can then use to generate mu2 that we can use to generate obs2.
Note that we do not have to use the same values of carc, a, and sdlog, but doing so can be useful because we know what, precisely, is changed between mu and mu2, or obs and obs2, which can help us understand the consequences of different parameters or functional response.
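The code for these additions is not shown above, but recycling the existing carc, a, and sdlog columns it might look like the following (the column names h, mu2, and obs2 are taken from their use later in the text):

```r
# replicate-specific handling time, same indexing trick as with a
df$h <- hs[df$d]
# Type II expectation: a*c / (1 + a*c*h)
df$mu2 <- df$a * df$carc / (1 + df$a * df$carc * df$h)
# simulated observations around the Type II expectation
df$obs2 <- rlnorm(n = nrow(df), meanlog = log(df$mu2), sdlog = df$sdlog)
```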
As long as we kept track of the variable names, that was actually pretty simple, right?
plot(mu2 ~ carc, data = df) # expected values | a, h
abline(a = 0, b = 1, lty = 2) # one-to-one line...
for(i in 1:10){
  curve(as[i]*x/(1 + as[i]*x*hs[i]), col = "gray", add = TRUE)
}
points(obs2 ~ carc, data = df, col = "red") # simulated observations
We still have a bit of trouble with some “observations” ending up with more carcass consumed than they started with—above the 1:1 line—but this is no worse than in our prior simulation of the type I functional response. Overall, it seems like this is working reasonably well!
From simulation to Bayesian model
So, as McElreath claims, and I have found, if you can simulate data you have essentially specified the components of a Bayesian model. Let me demonstrate, first with the Type I functional response model.
While we started with the parameters and then worked our way to observations in the simulation process, we specify the model in the reverse order, beginning with the description of the distribution of data. We simulated observations from a lognormal (using rlnorm()) given an expected value of \(\log(\mu)\) on the logarithmic scale, which I am going to call \(\mu_{\log}\) for consistency and transparency, and a standard deviation we called \(\sigma_{\log}\), so we write: \[
\begin{align}
Obs_i & \sim \text{LogNormal}(\mu_{\log_i}, \sigma_{\log}) \\
\mu_{\log_i} & = \log(\mu_i)
\end{align}
\]
Now we developed our expectation, \(\mu_i\), from the Type I functional response, which was just the attack rate, \(a\), times the initial carcass mass, \(c_i\). Again, we can add this to our model as: \[
\mu_i = a c_i
\]
That is everything that describes our data. We simply need to specify our priors. We used distributions with particular parameters to simulate our data. We can simply use these same distributions and parameters as our priors!
\[
\begin{align}
Obs_i & \sim \text{LogNormal}(\mu_{\log_i}, \sigma_{\log}) \\
\mu_{\log_i} & = \log(\mu_i) \\
\mu_i & = a c_i \\
a & \sim \text{Normal}(0.5, 0.25) \\
\sigma_{\log} & \sim \text{Exponential}(2)
\end{align}
\] That is it. That is our Bayesian model, defined! Not so bad, right?
We can similarly define the second model for the Type II functional response: \[
\begin{align}
Obs_i & \sim \text{LogNormal}(\mu_{\log_i}, \sigma_{\log}) \\
\mu_{\log_i} & = \log(\mu_i) \\
\mu_i & = \frac{a c_i}{1+a h c_i } \\
a & \sim \text{Normal}(0.5, 0.25) \\
h & \sim \text{Beta}(1,3) \\
\sigma_{\log} & \sim \text{Exponential}(2)
\end{align}
\]
Fitting the Bayesian model to the simulated data: Grid approximation
I’m sure you are still expecting this to be hard once we fit the model to our data. I’m here to assure you that it does not need to be very hard. We’ll keep this simple and use grid approximation. Since we have two parameters to worry about, things might look a bit more complicated, but it’s really quite similar. We just need to include the priors of both parameters in the numerator: \[
\Pr(a, \sigma_{\log} | obs) = \frac{\text{Pr}(obs| a, \sigma_{\log})\times \text{Pr}(a)\times\text{Pr}(\sigma_{\log})}{\text{Pr}(obs)}
\]
To do this calculation for realsies the steps are:
1. Establish a (two-dimensional) grid of values of our parameters, \(a\) and \(\sigma_{\log}\). This grid is what we are going to be calculating our likelihood and priors over, once for each point on the grid.
2. Calculate the likelihood of observing the data given each combination of parameters (i.e., for each point on the two-dimensional grid).
3. Multiply the likelihood by the prior probability of each combination of parameters to obtain the posterior-ish distribution.
4. Standardize this posterior-ish distribution by dividing each entry by the sum of all of the values in the posterior-ish matrix.
There are two complications. First, we’re going to use dnorm() and dexp() to do the calculations of the prior probability of each particular value of \(a\) and \(\sigma_{\log}\), respectively. Remember, these functions simply tell us the probability of any particular value given the values of mean and sd or rate that we supply, and we have already chosen the values of these for our priors! (Look above to refresh your memories if you need.)
Second, multiplying lots of small numbers (probabilities are less than one, after all) leads to all sorts of rounding errors. It turns out that it is much simpler to add the logs of numbers and then exponentiate the result. That is, \(a \times b \times c = \exp[\log(a) + \log(b) + \log(c)]\), and the version on the right-hand side is numerically better for a computer. So we’re going to do that.
OK, onward! But first, let’s specify a data set! We’ll just use the first of the 10 we generated.
obs <- df$obs[df$d == 1]
carc <- df$carc[df$d == 1]
Now, step 1: establish a grid of possible values of our parameters using the expand.grid() function that constructs a matrix of every combination of the different vectors you give it.
pars <- expand.grid(a = seq(from = 0, to = 1, length.out = 100),
                    sdlog = seq(from = 0, to = 2, length.out = 100))
Step 2: calculate the likelihood of observing the data given each combination of parameters. Actually, we’re going to calculate the log-likelihood for each observation and then add up these log-likelihoods at each combination of parameters. Also, we’re going to use sapply() to do this for each row of our pars data frame.
LL <- sapply(X = 1:nrow(pars), FUN = function(i) {
  sum(dlnorm(x = obs,
             meanlog = log(pars$a[i]*carc), # mu_log
             sdlog = pars$sdlog[i],
             log = TRUE # to calculate the log-likelihood
  ))
})
You might look at the head() and tail() of this LL vector we created. Notice that some values are -Inf. See what exponentiating these values produces. Does that make sense?
Step 3: multiply the likelihood by the prior probability of each combination of parameters to obtain the posterior-ish distribution. But remember, we’re working on the log-scale, so we want to calculate the log-probability of each value of each parameter across the grid and then add them to the log-likelihoods we just calculated.
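In code, this step might look like the following sketch (the name post_ish is taken from its use in the next step; the prior parameters are the ones we chose above):

```r
# log-prior of each value of a and sdlog across the grid
log_prior_a <- dnorm(pars$a, mean = 0.5, sd = 0.25, log = TRUE)
log_prior_sdlog <- dexp(pars$sdlog, rate = 2, log = TRUE)
# adding logs is the same as multiplying on the natural scale
post_ish <- LL + log_prior_a + log_prior_sdlog
```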
We now need to exponentiate these log-probabilities, but we need to worry about very small values being rounded to zero, so McElreath’s solution is to first subtract the maximum of the log-products from all of the entries before exponentiating. This means that what we get out is not exactly a probability distribution, but the values are relative (or proportional to) posterior probabilities. I’m just going to trust him on this.
post <- exp(post_ish - max(post_ish))
Working with the posterior
Right…so what do we do with this? Well, we can inspect the parameter distributions. Indeed, we probably want to see if we recaptured the True values of the parameters, the ones we simulated from. This is a good internal check to make sure everything is working.
Internal consistency checks
The simplest way to do this is simply to draw out samples of the parameters relative to their (posterior) probability. With other engines for Bayesian models we’ll have our samples from the posterior as output, directly, but with a grid approximation we need to produce them. Let’s first draw rows (or index numbers) of the posterior distribution, post, in proportion to their probabilities.
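The sampling code is not shown above, but it is probably just a weighted call to sample() along these lines (the name samples is taken from its use below):

```r
# draw 10,000 row indices of the grid, in proportion to their posterior probability
samples <- sample(x = 1:nrow(pars), size = 1e4, replace = TRUE, prob = post)
```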
We can then use these indexes to find the parameter values that are associated with these most probable parts of the distribution. This is like describing the shape of a mountain by taking samples of positions in space based on the height of those positions and then subsequently finding the coordinates of those points. Or short version, we now know the positions of the parameter data frame that has the most probable parameter values.
sample.a <- pars$a[samples] # see? indexing draws out the values of a
sample.sdlog <- pars$sdlog[samples]
Now we can plot histograms or density plots of these parameter samples. I’m going to ensure this covers the full range of values we considered in our grid approximation:
hist(sample.a, freq = FALSE, breaks = 0:100/100)
Depending on the data set and underlying parameters, this posterior distribution may look truncated on its edge. That would be because we only considered possible values from zero to one, so the posterior might be truncated by the way we ran the model, not because of any weird parameter magic.
It can be helpful to see how the posterior differs from our prior expectation of the attack rate parameter, \(a\).
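One way to do this, sketched here, is to overlay the prior density on the histogram of posterior samples:

```r
hist(sample.a, freq = FALSE, breaks = 0:100/100) # posterior samples of a
curve(dnorm(x, mean = 0.5, sd = 0.25), add = TRUE) # the prior on a
```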
Notice how the data pulled our rather vague expectations about what the attack rate might be into a much more narrow, tall distribution about the values of \(a\) that are most consistent with the prior and data. Think of it as our golem learning from the data. It seems to have learned a lot!
But did it learn the right things? Recall that we actually know the True value of a used to simulate the data. Does our posterior capture this value? Let’s add in a vertical red line to show that.
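Since the first data set was simulated with the first entry of our as vector, something like this sketch should work:

```r
hist(sample.a, freq = FALSE, breaks = 0:100/100) # posterior samples of a
curve(dnorm(x, mean = 0.5, sd = 0.25), add = TRUE) # the prior on a
abline(v = as[1], col = "red") # the True value used to simulate data set 1
```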
Again, depending on the random values of a, the red line might not fall in the exact center of the posterior (because of the prior, our golem was skeptical of values near zero or one, and beyond, even if those values were True), but it should be included somewhere in there.
We can do the same for the other parameter, \(\sigma_{\log}\).
hist(sample.sdlog, freq = FALSE, breaks = 0:200/100)
curve(dexp(x, rate = 2), add = TRUE)
abline(v = sdlogs[1], col = "red")
Again, we can see that our golem learned a lot about the variation in the data, and the range of values the golem thinks are consistent with the priors and data (i.e., the posterior distribution) should include the True value, even if it isn’t the most probable value.
The influences of parameters are often not independent of one another, and it can be useful to see the probability of combinations of parameters, although not so much in this case. Or, in other words, it can be useful to plot the priors and posteriors together so we can see how they co-vary.
Working in base R graphics is a pain, but McElreath has some handy functions that help, for instance, this contour plot.
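A sketch of this plot, assuming the rethinking package is installed; its contour_xyz() helper draws contours of z-values over x-y coordinates:

```r
library(rethinking)
# posterior probability (z) over the grid of a (x) and sdlog (y)
contour_xyz(x = pars$a, y = pars$sdlog, z = post)
```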
Note that since we didn’t have the prior probabilities of any combination of parameters in any particular place, we had to calculate them. We can also see how much our posterior changed from the prior.
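That prior calculation might look like this sketch, mirroring what we did for the posterior (scaled by the maximum so it is comparable):

```r
# log-prior of each combination of parameters on the grid
prior_ish <- dnorm(pars$a, mean = 0.5, sd = 0.25, log = TRUE) +
  dexp(pars$sdlog, rate = 2, log = TRUE)
prior <- exp(prior_ish - max(prior_ish))
# contours of the prior over the same grid
rethinking::contour_xyz(x = pars$a, y = pars$sdlog, z = prior)
```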
We can also calculate means and modes and intervals, just like you often get for parameter estimates.
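For instance, working from the posterior samples directly (PI() is McElreath's percentile-interval helper from the rethinking package; quantile() would work just as well):

```r
mean(sample.a) # posterior mean of a
rethinking::PI(sample.a, prob = 0.9) # central 90% interval of a
```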
We can also plot the relationships between initial and consumed carcass densities that are most consistent with the priors and data (i.e., most likely in the posterior). In this case, it’s pretty simple because our expected value, \(\mu_i\), is just a simple straight line, so I’ll use the short-cut of the abline() function.
plot(x = 0:10, y = 0:10, type = "l", lty = 2)
# some draws from the prior of a
for(i in 1:20){
  abline(a = 0, b = as[i], col = rgb(0, 0, 1, alpha = 0.2))
}
# draws from the posterior of a
for(i in 1:100){
  abline(a = 0, b = sample.a[i], col = rgb(0, 0, 0, alpha = 0.2))
}
We can see that the relationship is much narrower than it might have been.
This is a bit, um, ad hoc. Why don’t we formalize a way to calculate the range of possible lines that are most consistent with our priors and data? That is, let’s find a consistency (or credible or confidence) envelope around the average line that includes, say, 90% of the posterior probability.
# Generate a range of values along the x-axis
new_carcs <- seq(0.1, 10, length.out = 50)
# for each value along the x-axis, generate 10,000 possible observations
# each based on a draw of the parameter, a
possible_lines <- sapply(X = new_carcs, FUN = function(x) {
  x * sample.a[1:1e4]
})
If you inspect this (e.g., head(possible_lines)) you will see that there are 50 columns representing the 50 positions along the x-axis, each with a y-value (expected carcass consumption) from the 10,000 draws from the \(a\) posteriors. Think of it as one line per row, each line (and row) corresponding to one sample from the posterior. That might be a bit too much to plot, but we can use a quantile function to find the 5th and 95th percentiles, or similar, among the observations at each point along the x-axis.
If you inspect this you see the range of the internal 90% of possible observations. However, it is wide rather than long. Thankfully, R makes it easy to transpose a matrix:
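The intermediate code is not shown above; a sketch that parallels the posterior-predictive version later in the post, producing a data frame whose first two columns hold the 5th and 95th percentiles (ready for the renaming step in the plotting code):

```r
# 5th and 95th percentiles of expected consumption at each x-position
interval <- apply(X = possible_lines, MARGIN = 2,
                  FUN = quantile, probs = c(0.05, 0.95))
# transpose so each x-position is a row, and make it a data frame
pred_lines <- as.data.frame(t(interval))
# add in the x-values and the mean predicted line
pred_lines$carc <- new_carcs
pred_lines$mean <- mean(sample.a) * pred_lines$carc
```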
So now we have a data frame with all of the values we want to plot. The only issue is that it is hard to call variables with names like 5%, so we first need to rename those columns.
# rename 5% and 95% since R will choke on those
names(pred_lines)[1:2] <- c("low", "hi")
plot(0:10, 0:10, type = "l", lty = 2) # 1:1 line
lines(mean ~ carc, data = pred_lines, col = "red") # mean prediction
lines(low ~ carc, data = pred_lines, col = "red", lty = 2) # lower
lines(hi ~ carc, data = pred_lines, col = "red", lty = 2) # upper
# plot actual (simulated) observations
points(x = carc, y = obs)
So in this figure the solid line is the predicted relationship between the initial and consumed carcass mass from the mean of the posterior distribution and the dashed lines represent the 5th and 95th percentiles of the posterior.
Posterior predictive checks
It can be useful to plot the range of likely values of the observations. After all, we might get the gist of the relationship right, but really miss the mark in the data-generating process. For instance, perhaps our actual (simulated) observations often lie outside of the range of what our golem expects. These more extreme observations would tend to pull our golem’s understanding to these extremes. More on this in the future. But for now, let’s see how we could produce and plot possible observations. It will follow a very similar path to our method of generating the consistency envelope.
# Generate a range of values along the x-axis
new_carcs <- seq(0.1, 10, length.out = 50)
# for each value along the x-axis, generate 10,000 possible observations
# each based on a draw of a and sdlog
possible_obs <- sapply(X = new_carcs, FUN = function(x) {
  rlnorm(n = 1e4,
         meanlog = log(x * sample.a[1:1e4]),
         sdlog = sample.sdlog[1:1e4])
})
As above, if you inspect this (e.g., head(possible_obs)) you will see that there are 50 columns representing the 50 positions along the x-axis, but now each row is a possible set of observations corresponding to a draw from the posterior distribution of \(a\) and \(\sigma_{\log}\). Again, we can use the quantile() or PI() functions to find a particular interval of these observations.
interval <- apply(X = possible_obs, MARGIN = 2,
                  FUN = rethinking::PI, prob = 0.9)
# transpose and turn into a data frame.
pred_obs <- as.data.frame(t(interval))
# add in the x-values,
pred_obs$carc <- new_carcs
# and the predicted line based on the mean value of a from the posterior
pred_obs$pred <- mean(sample.a) * pred_obs$carc
# and plot this **posterior predictive interval**
# rename 5% and 95% since R will choke on those
names(pred_obs)[1:2] <- c("low", "hi")
plot(0:10, 0:10, type = "l", lty = 2) # 1:1 line
lines(pred ~ carc, data = pred_obs, col = "red") # mean prediction
lines(low ~ carc, data = pred_obs, col = "red", lty = 3) # lower
lines(hi ~ carc, data = pred_obs, col = "red", lty = 3) # upper
# plot actual (simulated) observations
points(x = carc, y = obs)
We should see that most, but not necessarily all of the observations fall within our 90% prediction interval. We should also not see systematic deviations from what the model says is likely, which would indicate some sort of bias or model misspecification. More on this later.
Hint
We can follow the same methods to simulate possible observations from the prior distribution, as well. It can be one thing to get a strong sense of the average or expected values based on prior values, but sometimes we completely flub the part of the model that specifies the probability of observing particular observations, making the noise much too small or much too large.
So at this point it seems that the model can reasonably recover the True values of the parameters (at least in my simulation) and seems to produce data that look like our (simulated) data. Does this work repeatedly for different data sets? It would be easy to check. Simply change the data set (e.g., df[df$d == 2, ]) and re-run the code. Ideally our model will work over a variety of possible data sets. If there are edge cases where things break, it can be useful to sort out why. Was it because of our priors? Were there limitations or implications of the way we produced expected values (i.e., \(\mu_i\)) that are coming to bite us? In essence, we want to have some confidence that our model is working how we think it should work, even in the edge cases.
Perspective
Is this a lot of work? Perhaps it is, but by this point we have a lot of confidence that we understand what we can expect and that our model is working as designed. It may feel like overkill in a simple case like this, but it is essential for more complex models where our intuition often fails us and simple inspection of parameter values yield minimal insights. So let’s practice in the simple case, getting a sense of the mechanics, before we move into less-charted waters. And anyway, the tools we (will) build to simplify things here should work in these more complex cases. If nothing else, I think working through these steps helps keep us in the driver’s seat of our analyses.
Repeating the process with the Type II functional response
We’ll repeat the process of producing, fitting, and checking our model with the Type II functional response. It will be quicker, but hopefully this repetition makes things feel more familiar. Again, the model is: \[
\begin{align}
Obs_i & \sim \text{LogNormal}(\mu_{\log_i}, \sigma_{\log}) \\
\mu_{\log_i} & = \log(\mu_i) \\
\mu_i & = \frac{a c_i}{1+a h c_i } \\
a & \sim \text{Normal}(0.5, 0.25) \\
h & \sim \text{Beta}(1,3) \\
\sigma_{\log} & \sim \text{Exponential}(2)
\end{align}
\]
We’ll use the observations from the first simulated data set as our data.
obs2 <- df$obs2[df$d ==1]
Step 1: establish a grid of possible values of our parameters using expand.grid(). We now have three parameters to consider, \(a\), \(h\), and \(\sigma_{\log}\). Since we are using 100 points along each axis we have \(100^3 = 1,000,000\) values to consider6.
Note that I am labeling everything with a 2 after it. This is just so we do not overwrite our prior model and related things, which allows us to work with both models. Just be careful copy-pasting!
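If you are following along in your own session, Step 1 looks like this; a sketch, with the same parameter ranges used for the real data below (the exact ranges are a judgment call):

```r
# Step 1: grid of candidate parameter values for the Type II model
# 100 points along each of the three axes = 1e6 combinations
pars2 <- expand.grid(a = seq(from = 0, to = 1, length.out = 100),
                     h = seq(from = 0, to = 1, length.out = 100),
                     sdlog = seq(from = 0.01, to = 2, length.out = 100))
```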
Step 2: calculate the log-likelihood of observing the data given each combination of parameters.
LL2 <- sapply(X = 1:nrow(pars2), FUN = function(i) {
  sum(dlnorm(x = obs2, 
             meanlog = log(pars2$a[i]*carc/(1 + pars2$a[i]*carc*pars2$h[i])), # mu_log
             sdlog = pars2$sdlog[i],
             log = TRUE # to calculate the log-likelihood
  ))
})
All we did was update our expected value based on the new functional form. Everything else is the same. Still, since we have 100x as many parameter combinations to consider I’m guessing that this operation took a few beats to calculate!
Step 3: multiply the likelihood by the prior probability of each combination of parameters to obtain the posterior-ish distribution.
post_ish2 <- LL2 +
  dnorm(pars2$a, mean = 0.5, sd = 0.25, log = TRUE) +
  dbeta(pars2$h, shape1 = 1, shape2 = 3, log = TRUE) + # this is the only new part
  dexp(pars2$sdlog, rate = 2, log = TRUE)
Now we exponentiate these log-probabilities, first scaling all of the entries by the maximum log-posterior value.
post2 <-exp(post_ish2 -max(post_ish2))
Now it is time to find our samples from the posterior. (Be careful about the relabeling.)
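As a sketch, the sampling step looks just like before, only with the 2 labels (this assumes pars2 and post2 from the chunks above):

```r
# Draw 10,000 grid rows in proportion to their posterior weight,
# then extract the matching parameter values (note the "2" labels)
samples2 <- sample(1:nrow(pars2), size = 1e4, replace = TRUE, prob = post2)
sample2.a <- pars2$a[samples2]
sample2.h <- pars2$h[samples2]
sample2.sdlog <- pars2$sdlog[samples2]
```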
We can also see the relationship between \(a\) and \(h\) in the posterior7.
# posterior
plot(sample2.a, sample2.h, pch = 16, col = rgb(0, 0, 1, 0.05))
Notice that there is a positive relationship such that larger estimates of attack rates are associated with larger estimates of handling time. Can you think of why?
Let’s now plot our understanding of the relationships between initial and consumed carcasses most consistent with the prior and data.
plot(x = 0:10, y = 0:10, type = "l", lty = 2)
# some draws from the prior of a and h
for(i in 1:20){
  curve(as[i]*x/(1 + as[i]*x*hs[i]), 
        col = rgb(0, 0, 1, alpha = 0.2), add = TRUE)
}
# draws from the posterior
for(i in 1:1000){
  curve(sample2.a[i]*x/(1 + sample2.a[i]*x*sample2.h[i]), 
        col = rgb(0, 0, 0, alpha = 0.05), add = TRUE)
}
Or, better, we can plot a consistency envelope around our predicted line.
# Generate a range of values along the x-axis
new_carcs <- seq(0.1, 10, length.out = 50)

# for each value along the x-axis, generate 10,000 possible expected values,
# each based on a draw of a and h
possible_lines2 <- sapply(X = new_carcs, FUN = function(x) {
  x*sample2.a[1:1e4]/(1 + x*sample2.a[1:1e4]*sample2.h[1:1e4])
})

# at each position then find the upper and lower 90% interval
interval_lines2 <- apply(X = possible_lines2, MARGIN = 2,
                         FUN = rethinking::PI, prob = 0.9)

# transpose and convert to a data frame
pred_lines2 <- as.data.frame(t(interval_lines2))
# add in the x-values,
pred_lines2$carc <- new_carcs
# and the predicted line based on the mean value of a & h from the posterior
pred_lines2$pred <- mean(sample2.a)*pred_lines2$carc /
  (1 + mean(sample2.a)*pred_lines2$carc*mean(sample2.h))

# plot this consistency envelope
# rename 5% and 95% since R will choke on those
names(pred_lines2)[1:2] <- c("low", "hi")

plot(0:10, 0:10, type = "l", lty = 2) # 1:1 line
lines(pred ~ carc, data = pred_lines2, col = "red") # mean prediction
lines(low ~ carc, data = pred_lines2, col = "red", lty = 2) # lower
lines(hi ~ carc, data = pred_lines2, col = "red", lty = 2) # upper
# plot actual (simulated) observations
points(x = carc, y = obs2)
And finally, we can calculate and add a posterior predictive interval describing the range of observations most consistent with our prior and data.
# Same stuff from prior chunk
plot(0:10, 0:10, type = "l", lty = 2) # 1:1 line
lines(pred ~ carc, data = pred_lines2, col = "red") # mean prediction
lines(low ~ carc, data = pred_lines2, col = "red", lty = 2) # lower
lines(hi ~ carc, data = pred_lines2, col = "red", lty = 2) # upper
# plot actual (simulated) observations
points(x = carc, y = obs2)

# for each value along the x-axis, generate 10,000 possible observations,
# each based on a draw of a, h, and sdlog
possible_obs2 <- sapply(X = new_carcs, FUN = function(x) {
  rlnorm(n = 1e4, 
         meanlog = log(x*sample2.a[1:1e4]/(1 + x*sample2.a[1:1e4]*sample2.h[1:1e4])), 
         sdlog = sample2.sdlog[1:1e4])
})

# at each position then find the upper and lower 90% interval
interval2 <- apply(X = possible_obs2, MARGIN = 2,
                   FUN = rethinking::PI, prob = 0.9)

# transpose and convert to a data frame
pred_obs2 <- as.data.frame(t(interval2))
# add in the x-values,
pred_obs2$carc <- new_carcs

# plot the posterior predictive interval
# rename 5% and 95% since R will choke on those
names(pred_obs2)[1:2] <- c("low", "hi")
lines(low ~ carc, data = pred_obs2, col = "red", lty = 3) # lower
lines(hi ~ carc, data = pred_obs2, col = "red", lty = 3) # upper
The vast majority of that code is the same. So, again, as long as you keep track of what’s what, you don’t need to reinvent the wheel each time you construct or run a model.
Fitting each model to the other’s data
So far we have fit each model to the data it produced, as a check on how well it works. But in the end, we are hoping to fit both models to the same real data and see which model is most consistent with those real observations. So it seems useful to fit each model to observations produced by the other model. In particular, let’s try fitting the Type I model to the data produced by the Type II model to see if we would even notice any systematic issues. (We’ll come back to metrics of model fit, later.)
Let’s simply re-fit the Type I model to the Type II data.
Again, we can use the same machinery as before; we just need to be careful about making sure we’re using the right data and right parameters. So please pay attention to the variable names!
# Step 1
# This is identical to the pars data frame above. 
# Just repeating it for clarity/completeness
pars3 <- expand.grid(a = seq(from = 0, to = 1, length.out = 100), 
                     sdlog = seq(from = 0, to = 2, length.out = 100))

# Step 2
LL3 <- sapply(X = 1:nrow(pars3), FUN = function(i) {
  sum(dlnorm(x = obs2, # <- obs simulated from the Type II model
             meanlog = log(pars3$a[i]*carc), # mu_log
             sdlog = pars3$sdlog[i],
             log = TRUE # to calculate the log-likelihood
  ))
})

# Step 3
post_ish3 <- LL3 +
  dnorm(pars3$a, mean = 0.5, sd = 0.25, log = TRUE) +
  dexp(pars3$sdlog, rate = 2, log = TRUE)

# Step 4
post3 <- exp(post_ish3 - max(post_ish3))
Then we can use the same approach as above to collect samples from the posterior:
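A sketch of that sampling step, assuming pars3 and post3 from the chunk above:

```r
# Draw grid rows in proportion to their posterior weight
samples3 <- sample(1:nrow(pars3), size = 1e4, replace = TRUE, prob = post3)
sample3.a <- pars3$a[samples3]
sample3.sdlog <- pars3$sdlog[samples3]
```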
hist(sample3.sdlog, freq = FALSE, breaks = 0:200/100)
curve(dexp(x, rate = 2), add = TRUE)
abline(v = sdlogs[1], col = "red")
So you might be seeing that the estimate of the \(\sigma_{\log}\) parameter is a bit higher than the True parameter. Can you imagine why? I would guess that this is because the data are deviating more from the straight line expectation of the Type I model since they actually come from a saturating relationship.
Probably what we’re most interested in is seeing whether the wrong model can generate predictions that are reasonably consistent with the actual (simulated) data. So let’s first plot these relationships between initial and consumed carcasses most consistent with the prior and data.
plot(x = 0:10, y = 0:10, type = "l", lty = 2)
# some draws from the prior of a
for(i in 1:20){
  abline(a = 0, b = as[i], col = rgb(0, 0, 1, alpha = 0.2))
}
# draws from the posterior
for(i in 1:1000){
  abline(a = 0, b = sample3.a[i], col = rgb(0, 0, 0, alpha = 0.05))
}
And then let us see what sort of observations we would expect under this model.
# Generate a range of values along the x-axis
new_carcs <- seq(0.1, 10, length.out = 50)

# for each value along the x-axis, generate 10,000 possible observations,
# each based on a draw of a and sdlog
possible_obs <- sapply(X = new_carcs, FUN = function(x) {
  rlnorm(n = 1e4, 
         meanlog = log(x*sample3.a[1:1e4]), 
         sdlog = sample3.sdlog[1:1e4])
})

# With this huge matrix, calculate the interval at each x-value
interval <- apply(X = possible_obs, MARGIN = 2,
                  FUN = rethinking::PI, prob = 0.9)

# transpose and convert to a data frame
pred_obs <- as.data.frame(t(interval))
# add in the x-values and the predicted mean
pred_obs$carc <- new_carcs
pred_obs$pred <- mean(sample3.a)*pred_obs$carc

# rename 5% and 95% since R will choke on those
names(pred_obs)[1:2] <- c("low", "hi")

plot(0:10, 0:10, type = "l", lty = 2) # 1:1 line
lines(pred ~ carc, data = pred_obs, col = "red") # mean prediction
lines(low ~ carc, data = pred_obs, col = "red", lty = 3) # lower
lines(hi ~ carc, data = pred_obs, col = "red", lty = 3) # upper
# plot actual (simulated) observations
points(x = carc, y = obs2)
So at this point it does seem our model can predict data that are consistent with our actual data, but it does that by assuming a large standard deviation around the expected (or mean) line. It also sort of looks like the model tends to under-predict carcass consumption at low amounts of initial carcass mass and over-predict consumption at higher initial masses. Let’s explore this deviation from the expectation to look for systematic bias, which would be indicative of a poorly-fitting model.
First, for each actual (simulated) observation, let’s generate a prediction, here based on the average value of \(a\) across the posterior.
predicted <- mean(sample3.a)*carc
Then let’s plot the difference between the actual (simulated) observations and this prediction against the initial carcass mass.
plot(x = carc, y = obs2 - predicted)
abline(a = 0, b = 0, lty = 2) # a line at zero to make this pattern clearer
So yes, my intuition about underpredicting for low initial masses and overpredicting for high initial masses seems to be supported. In other words, this Type I model is struggling to explain the observations simulated from the Type II model. This would be bad for our model if that was our goal, but it is good for our goal of correctly differentiating the “correct” functional relationship between carcass consumption and initial carcass mass.
This would be a more compelling example if we had done the same thing for the Type II model.
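A sketch of that same residual check for the Type II model fit to its own data, assuming the sample2.* vectors, carc, and obs2 from above:

```r
# Prediction from the posterior-mean parameters of the Type II model
predicted2 <- mean(sample2.a)*carc /
  (1 + mean(sample2.a)*carc*mean(sample2.h))

# Residuals (observed minus expected) against initial carcass mass
plot(x = carc, y = obs2 - predicted2)
abline(a = 0, b = 0, lty = 2) # a line at zero to make patterns clearer
```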
Right! So whatever systematic bias we observed when fitting the wrong model is not apparent here. It’s not exactly a tight fit in that there is still a lot of scatter around the line! It’s only that the scatter above versus below the zero line seems equivalent wherever we are along the x-axis.
We could also see the reverse, how well the Type II model predicts data simulated from the Type I process, but I’ll set that aside for now.
I’ll also set aside metrics of “fit” for now (because I know you’re all wondering how we can measure the fit of both models to the data). So more later. For now, onward!
Real data!
All of that was warm-up, model and tool development and testing. It was all meant so that we could be confident we knew how things worked. Now we are ready to fit the model to our actual data8!
We’ve done the hard work of building the code to fit the two models, so let’s simply recycle that code here, being sure we’re using the right variable names.
The type I model
I’m labeling everything with an I for the Type I model. Again, the code is essentially the same, following the steps of the grid-approximation routine:
# Step 1
# This is identical to the pars data frame above. 
# Just repeating it for clarity/completeness
parsI <- expand.grid(a = seq(from = 0, to = 1, length.out = 100), 
                     sdlog = seq(from = 0.01, to = 2, length.out = 100))

# Step 2
LL_I <- sapply(X = 1:nrow(parsI), FUN = function(i) {
  sum(dlnorm(x = consumed, # the real data!
             meanlog = log(parsI$a[i]*initial), # mu_log
             sdlog = parsI$sdlog[i],
             log = TRUE # to calculate the log-likelihood
  ))
})

# Step 3
post_ishI <- LL_I +
  dnorm(parsI$a, mean = 0.5, sd = 0.25, log = TRUE) +
  dexp(parsI$sdlog, rate = 2, log = TRUE)

# Step 4
post_I <- exp(post_ishI - max(post_ishI, na.rm = TRUE))
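Before plotting we need posterior samples; a sketch, assuming parsI and post_I from the chunk above:

```r
# Get samples from the Type I posterior (note the "I" labels)
samplesI <- sample(1:nrow(parsI), size = 1e4, replace = TRUE, prob = post_I)
sampleI.a <- parsI$a[samplesI]
sampleI.sdlog <- parsI$sdlog[samplesI]
```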
And finally we probably want to examine the predictions of the fit model against initial carcass mass as well as the posterior predictive distribution (i.e., what data would look like if it came from this model, given parameter and sampling uncertainty) and real (for real!) data.
# Generate a range of values along the x-axis
new_carcs <- seq(min(initial), max(initial), length.out = 50)

# for each value along the x-axis, generate 10,000 possible expected values,
# each based on a draw of a
possible_linesI <- sapply(X = new_carcs, FUN = function(x) {
  x*sampleI.a[1:1e4]
})

# at each position then find the upper and lower 90% interval
interval_linesI <- apply(X = possible_linesI, MARGIN = 2,
                         FUN = rethinking::PI, prob = 0.9)

# transpose and convert to a data frame
pred_linesI <- as.data.frame(t(interval_linesI))
# add in the x-values,
pred_linesI$carc <- new_carcs
# and the predicted line based on the mean value of a from the posterior
pred_linesI$pred <- mean(sampleI.a)*pred_linesI$carc

# rename 5% and 95% since R will choke on those
names(pred_linesI)[1:2] <- c("low", "hi")

plot(0:10, 0:10, type = "l", lty = 2, 
     xlab = "Initial carcass mass (g)", 
     ylab = "Carcass mass consumed in 48h (g)") # 1:1 line
lines(pred ~ carc, data = pred_linesI, col = "red") # mean prediction
lines(low ~ carc, data = pred_linesI, col = "red", lty = 2) # lower
lines(hi ~ carc, data = pred_linesI, col = "red", lty = 2) # upper
# plot actual (real!) observations
points(x = initial, y = consumed)

# NOW the posterior predictive interval
# for each value along the x-axis, generate 10,000 possible observations,
# each based on a draw of a and sdlog
possible_obsI <- sapply(X = new_carcs, FUN = function(x) {
  rlnorm(n = 1e4, 
         meanlog = log(x*sampleI.a[1:1e4]), 
         sdlog = sampleI.sdlog[1:1e4])
})

# With this huge matrix, calculate the interval at each x-value
intervalI <- apply(X = possible_obsI, MARGIN = 2,
                   FUN = rethinking::PI, prob = 0.9)

# transpose and convert to a data frame
pred_obsI <- as.data.frame(t(intervalI))
# add in the x-values
pred_obsI$carc <- new_carcs

# rename 5% and 95% since R will choke on those
names(pred_obsI)[1:2] <- c("low", "hi")
lines(low ~ carc, data = pred_obsI, col = "red", lty = 3) # lower
lines(hi ~ carc, data = pred_obsI, col = "red", lty = 3) # upper
Whoo, are those posterior predictive intervals wide!
Let’s also repeat our examination of the residuals (observed-expected).
predicted <- mean(sampleI.a)*initial
plot(x = initial, y = consumed - predicted)
abline(a = 0, b = 0, lty = 2) # a line at zero to make this pattern clearer
So we see a very clear trend of underpredicting values of consumption at low initial carcass masses and overpredicting values of consumption at higher initial carcass masses. This is a model that is failing to produce the pattern in the data!
The type II model
We’ll follow the same workflow, just keeping the second model in mind and labeling things with II for the Type II model.
# Step 1
# This is identical to the pars data frame above. 
# Just repeating it for clarity/completeness
parsII <- expand.grid(a = seq(from = 0, to = 1, length.out = 100), 
                      h = seq(from = 0, to = 1, length.out = 100), 
                      sdlog = seq(from = 0.01, to = 2, length.out = 100))

# Step 2
LL_II <- sapply(X = 1:nrow(parsII), FUN = function(i) {
  sum(dlnorm(x = consumed, # the real data!
             meanlog = log(parsII$a[i]*initial/(1 + parsII$a[i]*initial*parsII$h[i])), # mu_log
             sdlog = parsII$sdlog[i],
             log = TRUE # to calculate the log-likelihood
  ))
})

# Step 3
post_ishII <- LL_II +
  dnorm(parsII$a, mean = 0.5, sd = 0.25, log = TRUE) +
  dbeta(parsII$h, shape1 = 1, shape2 = 3, log = TRUE) +
  dexp(parsII$sdlog, rate = 2, log = TRUE)

# Step 4
post_II <- exp(post_ishII - max(post_ishII, na.rm = TRUE))

# Get samples
samplesII <- sample(1:length(post_II), size = 1e4, replace = TRUE, prob = post_II)
sampleII.a <- parsII$a[samplesII] 
sampleII.h <- parsII$h[samplesII] 
sampleII.sdlog <- parsII$sdlog[samplesII]
Then we can examine the posteriors of the parameters.
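For instance, with histograms of the marginal posteriors; a sketch, assuming the sampleII.* vectors from the chunk above:

```r
# Marginal posterior histograms for the three Type II parameters
par(mfrow = c(1, 3))
hist(sampleII.a, freq = FALSE, main = "", xlab = "a")
hist(sampleII.h, freq = FALSE, main = "", xlab = "h")
hist(sampleII.sdlog, freq = FALSE, main = "", xlab = expression(sigma[log]))
par(mfrow = c(1, 1))
```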
That’s what we expected from our simulated data, so that’s nice.
Finally, let’s repeat the posterior predictive plot.
# Generate a range of values along the x-axis; same as above
new_carcs <- seq(min(initial), max(initial), length.out = 50)

# for each value along the x-axis, generate 10,000 possible expected values,
# each based on a draw of a and h
possible_linesII <- sapply(X = new_carcs, FUN = function(x) {
  x*sampleII.a[1:1e4]/(1 + x*sampleII.a[1:1e4]*sampleII.h[1:1e4])
})

# at each position then find the upper and lower 90% interval
interval_linesII <- apply(X = possible_linesII, MARGIN = 2,
                          FUN = rethinking::PI, prob = 0.9)

# transpose and convert to a data frame
pred_linesII <- as.data.frame(t(interval_linesII))
# add in the x-values,
pred_linesII$carc <- new_carcs
# and the predicted line based on the mean value of a & h from the posterior
pred_linesII$pred <- mean(sampleII.a)*pred_linesII$carc /
  (1 + mean(sampleII.a)*pred_linesII$carc*mean(sampleII.h))

# rename 5% and 95% since R will choke on those
names(pred_linesII)[1:2] <- c("low", "hi")

plot(0:10, 0:10, type = "l", lty = 2, 
     xlab = "Initial carcass mass (g)", 
     ylab = "Carcass mass consumed in 48h (g)") # 1:1 line
lines(pred ~ carc, data = pred_linesII, col = "blue") # mean prediction
lines(low ~ carc, data = pred_linesII, col = "blue", lty = 2) # lower
lines(hi ~ carc, data = pred_linesII, col = "blue", lty = 2) # upper
# plot actual (real!) observations
points(x = initial, y = consumed)

# NOW the posterior predictive interval
# for each value along the x-axis, generate 10,000 possible observations,
# each based on a draw of a, h, and sdlog
possible_obsII <- sapply(X = new_carcs, FUN = function(x) {
  rlnorm(n = 1e4, 
         meanlog = log(x*sampleII.a[1:1e4]/(1 + x*sampleII.a[1:1e4]*sampleII.h[1:1e4])), 
         sdlog = sampleII.sdlog[1:1e4])
})

# With this huge matrix, calculate the interval at each x-value
intervalII <- apply(X = possible_obsII, MARGIN = 2,
                    FUN = rethinking::PI, prob = 0.9)

# transpose and convert to a data frame
pred_obsII <- as.data.frame(t(intervalII))
# add in the x-values
pred_obsII$carc <- new_carcs

# rename 5% and 95% since R will choke on those
names(pred_obsII)[1:2] <- c("low", "hi")
lines(low ~ carc, data = pred_obsII, col = "blue", lty = 3) # lower
lines(hi ~ carc, data = pred_obsII, col = "blue", lty = 3) # upper
And for fun, let’s add in the lines from the other model.
plot(0:10, 0:10, type = "l", lty = 2) # 1:1 line
points(x = initial, y = consumed)

# Type II model
lines(pred ~ carc, data = pred_linesII, col = "blue") # mean prediction
lines(low ~ carc, data = pred_linesII, col = "blue", lty = 2) # lower
lines(hi ~ carc, data = pred_linesII, col = "blue", lty = 2) # upper
lines(low ~ carc, data = pred_obsII, col = "blue", lty = 3) # lower
lines(hi ~ carc, data = pred_obsII, col = "blue", lty = 3) # upper

# Type I model
lines(pred ~ carc, data = pred_linesI, col = "red") # mean prediction
lines(low ~ carc, data = pred_linesI, col = "red", lty = 2) # lower
lines(hi ~ carc, data = pred_linesI, col = "red", lty = 2) # upper
lines(low ~ carc, data = pred_obsI, col = "red", lty = 3) # lower
lines(hi ~ carc, data = pred_obsI, col = "red", lty = 3) # upper
That looks like a pretty good fit. Let’s examine our residuals (observations - expected) to look for biases.
predicted <- mean(sampleII.a)*initial/(1 + mean(sampleII.a)*initial*mean(sampleII.h))
plot(x = initial, y = consumed - predicted)
abline(a = 0, b = 0, lty = 2) # a line at zero to make this pattern clearer
That looks much, much less bad compared to the Type I model.
So, to recap, it looks like our data are better described by a Type II model, although the predictions of the two models overlap quite a lot over the full range of our data. That said, there appears to be a large handling time:
mean(sampleII.h)
[1] 0.6427687
rethinking::PI(sampleII.h, prob=0.9)
5% 95%
0.4040404 0.8787879
Since this is handling time in units of the duration of the experiment, which was 48 hours, we can multiply these estimates by 48 to express the handling time in hours (per gram).
mean(sampleII.h)*48
[1] 30.8529
rethinking::PI(sampleII.h, prob=0.9)*48
5% 95%
19.39394 42.18182
So that, from a biological point of view, seems substantial. We can use these estimates to make predictions about how many carcasses (or how much carcass mass) might be removed by scavengers under different scenarios, having some confidence that a model that includes handling time is much more reasonable. However, we’ll have to return to the concept of model comparison later. For now, perhaps we can compare parameter estimates and think about, for instance, what the much larger estimate of \(\sigma_{\log}\) for Model I means.
To better understand how animals might forage, what strategies they would take, etc., he had undergraduate volunteers pick up discs of sand paper (I think… my recollection is imperfect) with tacks or their fingers while blindfolded, etc., counting how many they could collect in a certain amount of time. He noticed the empirical patterns, which others then built equations to describe and even explain; I don’t think he was that mathematically inclined. Anyway, they’ve been hugely influential and useful, including here.↩︎
There are actually a lot of equations that produce more or less similar curves, starting with different assumptions. The Michaelis-Menten equation is one common approach. It just goes to show you that a hypothesis can be represented by multiple models and a model can be consistent with multiple equations.↩︎
We can do these things, but they’re beyond our current scope.↩︎
If we wanted to enforce our consumed carcass values to be \(\leq\) the initial carcass values we could model the proportion of the carcass consumed and use a beta distribution to describe the variation.↩︎
NB: using the same plotting functions as before caused…issues, so we’ll just do this hacky version for now. Since we’re not going to keep using grid-approximation for too much longer it’s not worth spending too much time figuring out how to circumvent these issues.↩︎
Actually, we had data from two ponds, but we have not yet seen how to deal with this issue, so we’ll begin with just the single pond.↩︎